ABSTRACT

An array processor includes processing elements arranged 
in clusters which are, in turn, combined in a rectangular 
array. Each cluster is formed of processing elements which 
preferably communicate with the processing elements of at 
least two other clusters. Additionally, each inter-cluster 
communication path is mutually exclusive, that is, each path 
carries either north and west, south and east, north and east, 
or south and west communications. Due to the mutual 
exclusivity of the data paths, communications between the 
processing elements of each cluster may be combined in a 
single inter-cluster path. That is, communications from a 
cluster which communicates to the north and east with another 
cluster may be combined in one path, thus eliminating half the 
wiring required for the path. Additionally, the length of the 
longest communication path is not directly determined by the 
overall dimension of the array, as it is in conventional torus 
arrays. Rather, the longest communication path is limited 
only by the inter-cluster spacing. In one implementation, 
transpose elements of an N x N torus are combined in clusters 
and communicate with one another through intra-cluster 
communications paths. Since transpose elements have direct 
connections to one another, transpose operation latency is 
eliminated in this approach. Additionally, each PE may have a 
single transmit port and a single receive port. As a result, 
the individual PEs are decoupled from the topology of the 
array.
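
As a rough illustration of the transpose latency mentioned above, the following sketch counts the nearest-neighbour hops that separate a PE at (i, j) from its transpose element at (j, i) in a conventional N x N torus. It is a minimal model under stated assumptions, not the implementation described here: the function names (torus_hops, transpose_pairs, worst_transpose_hops) and the (row, column) coordinate scheme are hypothetical and chosen only for this example.

```python
def torus_hops(a, b, n):
    """Minimum nearest-neighbour hops between PEs a and b on an n x n torus.

    a and b are (row, column) tuples; this coordinate scheme is an assumption.
    """
    def ring(x, y):
        # Distance around a ring of n nodes, using the wrap-around link if shorter.
        d = abs(x - y) % n
        return min(d, n - d)

    return ring(a[0], b[0]) + ring(a[1], b[1])


def transpose_pairs(n):
    """List each off-diagonal PE (i, j), i < j, with its transpose element (j, i)."""
    return [((i, j), (j, i)) for i in range(n) for j in range(i + 1, n)]


def worst_transpose_hops(n):
    """Largest hop count between any PE and its transpose element on the torus."""
    return max(torus_hops(a, b, n) for a, b in transpose_pairs(n))


if __name__ == "__main__":
    for n in (4, 8):
        print(n, worst_transpose_hops(n))
```

In this model the worst-case hop count between transpose elements grows with N in a conventional torus, whereas placing transpose elements in the same cluster lets them exchange data over an intra-cluster path regardless of N.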


